counterfactual analysis
Beyond the Black Box: Demystifying Multi-Turn LLM Reasoning with VISTA
Zhang, Yiran, Lin, Mingyang, Dras, Mark, Naseem, Usman
Recent research has increasingly focused on the reasoning capabilities of Large Language Models (LLMs) in multi-turn interactions, as these scenarios more closely mirror real-world problem-solving. However, analyzing the intricate reasoning processes within these interactions presents a significant challenge due to complex contextual dependencies and a lack of specialized visualization tools, leading to a high cognitive load for researchers. To address this gap, we present VISTA, a web-based Visual Interactive System for Textual Analytics in multi-turn reasoning tasks. VISTA allows users to visualize the influence of context on model decisions and interactively modify conversation histories to conduct "what-if" analyses across different models. Furthermore, the platform can automatically parse a session and generate a reasoning dependency tree, offering a transparent view of the model's step-by-step logical path. By providing a unified and interactive framework, VISTA significantly reduces the complexity of analyzing reasoning chains, thereby facilitating a deeper understanding of the capabilities and limitations of current LLMs. The platform is open-source and supports easy integration of custom benchmarks and local models.
From Promising Capability to Pervasive Bias: Assessing Large Language Models for Emergency Department Triage
Lee, Joseph, Shang, Tianqi, Baik, Jae Young, Duong-Tran, Duy, Yang, Shu, Li, Lingyao, Shen, Li
Large Language Models (LLMs) have shown promise in clinical decision support, yet their application to triage remains underexplored. We systematically investigate the capabilities of LLMs in emergency department triage through two key dimensions: (1) robustness to distribution shifts and missing data, and (2) counterfactual analysis of intersectional biases across sex and race. We assess multiple LLM-based approaches, ranging from continued pre-training to in-context learning, as well as machine learning approaches. Our results indicate that LLMs exhibit superior robustness, and we investigate the key factors contributing to the promising LLM-based approaches. Furthermore, in this setting, we identify gaps in LLM preferences that emerge in particular intersections of sex and race. LLMs generally exhibit sex-based differences, but they are most pronounced in certain racial groups. These findings suggest that LLMs encode demographic preferences that may emerge in specific clinical contexts or particular combinations of characteristics.
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- South America > Brazil (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Counterfactual optimization for fault prevention in complex wind energy systems
Carrizosa, Emilio, Fischetti, Martina, Haaker, Roshell, Morales, Juan Miguel
Machine Learning models are increasingly used in businesses to detect faults and anomalies in complex systems. In this work, we take this approach a step further: beyond merely detecting anomalies, we aim to identify the optimal control strategy that restores the system to a safe state with minimal disruption. We frame this challenge as a counterfactual problem: given a Machine Learning model that classifies system states as either "good" or "anomalous," our goal is to determine the minimal adjustment to the system's control variables (i.e., its current status) that is necessary to return it to the "good" state. To achieve this, we leverage a mathematical model that finds the optimal counterfactual solution while respecting system-specific constraints. Notably, most counterfactual analysis in the literature focuses on individual cases where a person seeks to alter their status relative to a decision made by a classifier--such as for loan approval or medical diagnosis. Our work addresses a fundamentally different challenge: optimizing counterfactuals for a complex energy system, specifically an offshore wind turbine oil-type transformer. This application not only advances counterfactual optimization in a new domain but also opens avenues for broader research in this area. Our tests on real-world data provided by our industrial partner show that our methodology easily adapts to user preferences and brings savings on the order of 3 million € per year in a typical farm.
Introduction. Energy systems are becoming increasingly complex, making it more challenging--and more critical--to detect faults early and develop strategies to mitigate them. In this context, Machine Learning (ML) techniques have become an industry standard for early fault detection [16]. Energy companies can monitor various sensor readings from the turbines and apply ML methods to identify potential issues with components.
In this paper, we define a fault (or faulty state) as a condition where a component is in an unsafe status, while an anomaly refers to any irregularity that is not necessarily dangerous. Note that faults are a subset of anomalies. When a fault is detected, a controller is immediately activated to prevent severe damage to the turbine. Machine Learning models can detect anomalies in advance, providing companies with a window of time to intervene before faults occur.
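The counterfactual framing in this abstract (find the smallest change to the control variables that moves the classifier's output back to "good") can be illustrated with a toy case. The sketch below is not the authors' optimization model: it assumes a linear classifier where a score of at least `margin` means "good", measures the adjustment with the Euclidean norm, and ignores all system-specific constraints. The function name `minimal_counterfactual` and every parameter are hypothetical.

```python
from typing import List, Sequence, Set


def minimal_counterfactual(
    x: Sequence[float],
    w: Sequence[float],
    b: float,
    controllable: Set[int],
    margin: float = 0.0,
) -> List[float]:
    """Smallest L2 change to the controllable features of x such that
    w . x + b >= margin (assumed to mean "good").

    Assumption: a linear classifier; real fault detectors are usually not
    linear, and a real system would add operating constraints.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    if score >= margin:
        return list(x)  # already classified as "good", nothing to change

    # Only controllable variables may move, so zero out the other weights.
    w_c = [wi if i in controllable else 0.0 for i, wi in enumerate(w)]
    norm_sq = sum(wi * wi for wi in w_c)
    if norm_sq == 0.0:
        raise ValueError("no controllable variable influences the score")

    # Project onto the decision boundary along the masked weight vector.
    step = (margin - score) / norm_sq
    return [xi + step * wi for xi, wi in zip(x, w_c)]


if __name__ == "__main__":
    # Hypothetical state: indices 0 and 1 are controls, index 2 is not.
    x_cf = minimal_counterfactual([0.0, 1.0, 2.0], [1.0, -2.0, 0.5], -1.0, {0, 1})
    print(x_cf)
```

In this toy setting the answer has a closed form, which is why no solver is needed; with a nonlinear classifier or operating constraints, the same question generally becomes a constrained optimization problem.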
- Europe > Spain > Andalusia > Seville Province > Seville (0.04)
- Europe > Spain > Andalusia > Málaga Province > Málaga (0.04)
- Europe > Northern Europe (0.04)
- Research Report (0.82)
- Overview (0.67)
Beyond Patterns: Harnessing Causal Logic for Autonomous Driving Trajectory Prediction
Wang, Bonan, Liao, Haicheng, Wang, Chengyue, Rao, Bin, Guan, Yanchen, Yu, Guyang, Zhang, Jiaxun, Lai, Songning, Xu, Chengzhong, Li, Zhenning
Accurate trajectory prediction has long been a major challenge for autonomous driving (AD). Traditional data-driven models predominantly rely on statistical correlations, often overlooking the causal relationships that govern traffic behavior. In this paper, we introduce a novel trajectory prediction framework that leverages causal inference to enhance predictive robustness, generalization, and accuracy. By decomposing the environment into spatial and temporal components, our approach identifies and mitigates spurious correlations, uncovering genuine causal relationships. We also employ a progressive fusion strategy to integrate multimodal information, simulating human-like reasoning processes and enabling real-time inference. Evaluations on five real-world datasets--ApolloScape, nuScenes, NGSIM, HighD, and MoCAD--demonstrate our model's superiority over existing state-of-the-art (SOTA) methods, with improvements in key metrics such as RMSE and FDE. Our findings highlight the potential of causal reasoning to transform trajectory prediction, paving the way for robust AD systems.
- Information Technology > Artificial Intelligence > Natural Language (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.69)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (0.48)
Enhancing Metabolic Syndrome Prediction with Hybrid Data Balancing and Counterfactuals
Shah, Sanyam Paresh, Mamun, Abdullah, Soumma, Shovito Barua, Ghasemzadeh, Hassan
Metabolic Syndrome (MetS) is a cluster of interrelated risk factors that significantly increases the risk of cardiovascular diseases and type 2 diabetes. Despite its global prevalence, accurate prediction of MetS remains challenging due to issues such as class imbalance, data scarcity, and methodological inconsistencies in existing studies. In this paper, we address these challenges by systematically evaluating and optimizing machine learning (ML) models for MetS prediction, leveraging advanced data balancing techniques and counterfactual analysis. Multiple ML models, including XGBoost, Random Forest, TabNet, etc., were trained and compared under various data balancing techniques such as random oversampling (ROS), SMOTE, ADASYN, and CTGAN. Additionally, we introduce MetaBoost, a novel hybrid framework that integrates SMOTE, ADASYN, and CTGAN, optimizing synthetic data generation through weighted averaging and iterative weight tuning to enhance the model's performance (achieving up to a 1.87% accuracy improvement over individual balancing techniques). A comprehensive counterfactual analysis is conducted to quantify the feature-level changes required to shift individuals from high-risk to low-risk categories. The results indicate that blood glucose (50.3%) and triglycerides (46.7%) were the most frequently modified features, highlighting their clinical significance in MetS risk reduction. Additionally, probabilistic analysis shows elevated blood glucose (85.5% likelihood) and triglycerides (74.9% posterior probability) as the strongest predictors. This study not only advances the methodological rigor of MetS prediction but also provides actionable insights for clinicians and researchers, highlighting the potential of ML in mitigating the public health burden of metabolic syndrome.
- North America > United States > Arizona > Maricopa County > Tempe (0.04)
- North America > United States > Arizona > Maricopa County > Phoenix (0.04)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Integrating Probabilistic Trees and Causal Networks for Clinical and Epidemiological Data
Zahoor, Sheresh, Liò, Pietro, Dias, Gaël, Hasanuzzaman, Mohammed
Healthcare decision-making requires not only accurate predictions but also insights into how factors influence patient outcomes. While traditional Machine Learning (ML) models excel at predicting outcomes, such as identifying high-risk patients, they are limited in addressing what-if questions about interventions. This study introduces the Probabilistic Causal Fusion (PCF) framework, which integrates Causal Bayesian Networks (CBNs) and Probability Trees (PTrees) to extend beyond predictions. PCF leverages causal relationships from CBNs to structure PTrees, enabling both the quantification of factor impacts and simulation of hypothetical interventions. PCF was validated on three real-world healthcare datasets, namely MIMIC-IV, Framingham Heart Study, and Diabetes, chosen for their clinically diverse variables. It demonstrated predictive performance comparable to traditional ML models while providing additional causal reasoning capabilities. To enhance interpretability, PCF incorporates sensitivity analysis and SHapley Additive exPlanations (SHAP). Sensitivity analysis quantifies the influence of causal parameters on outcomes such as Length of Stay (LOS), Coronary Heart Disease (CHD), and Diabetes, while SHAP highlights the importance of individual features in predictive modeling. By combining causal reasoning with predictive modeling, PCF bridges the gap between clinical intuition and data-driven insights. Its ability to uncover relationships between modifiable factors and simulate hypothetical scenarios provides clinicians with a clearer understanding of causal pathways. This approach supports more informed, evidence-based decision-making, offering a robust framework for addressing complex questions in diverse healthcare settings.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > France (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Model-Based Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
What Is a Counterfactual Cause in Action Theories?
Since the proposal by Halpern and Pearl, reasoning about actual causality has gained increasing attention in artificial intelligence, ranging from domains such as model-checking and verification to reasoning about actions and knowledge. More recently, Batusov and Soutchanski proposed a notion of actual achievement cause in the situation calculus which, among other things, can determine the cause of quantified effects in a given action history. While intuitively appealing, this notion of cause is not defined from a counterfactual perspective. In this paper, we propose a notion of cause based on counterfactual analysis. In the context of action history, we show that our notion of cause generalizes naturally to a notion of achievement cause. We analyze the relationship between our notion of the achievement cause and the achievement cause by Batusov and Soutchanski. Finally, we relate our account of cause to Halpern and Pearl's account of actual causality. In particular, we note some nuances in applying a counterfactual viewpoint to disjunctive goals, a common thorn in definitions of actual causes.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
Navigating Tomorrow: Reliably Assessing Large Language Models Performance on Future Event Prediction
Predicting future events is an important activity with applications across multiple fields and domains. For example, the capacity to foresee stock market trends, natural disasters, business developments, or political events can facilitate early preventive measures and uncover new opportunities. Multiple diverse computational methods for attempting future predictions, including predictive analysis, time series forecasting, and simulations, have been proposed. This study evaluates the performance of several large language models (LLMs) in supporting future prediction tasks, an under-explored domain. We assess the models across three scenarios: Affirmative vs. Likelihood questioning, Reasoning, and Counterfactual analysis. For this, we create a dataset by finding and categorizing news articles based on entity type and its popularity. We gather news articles before and after the LLMs' training cutoff date in order to thoroughly test and compare model performance. Our research highlights LLMs' potential and limitations in predictive modeling, providing a foundation for future improvements.
- Europe > Austria > Tyrol > Innsbruck (0.05)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Singapore (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
What If We Had Used a Different App? Reliable Counterfactual KPI Analysis in Wireless Systems
Hou, Qiushuo, Park, Sangwoo, Zecchin, Matteo, Cai, Yunlong, Yu, Guanding, Simeone, Osvaldo
In modern wireless network architectures, such as Open Radio Access Network (O-RAN), the operation of the radio access network (RAN) is managed by applications, or apps for short, deployed at intelligent controllers. These apps are selected from a given catalog based on current contextual information. For instance, a scheduling app may be selected on the basis of current traffic and network conditions. Once an app is chosen and run, it is no longer possible to directly test the key performance indicators (KPIs) that would have been obtained with another app. In other words, we can never simultaneously observe both the actual KPI, obtained by the selected app, and the counterfactual KPI, which would have been attained with another app, for the same network condition, making individual-level counterfactual KPI analysis particularly challenging. This what-if analysis, however, would be valuable to monitor and optimize the network operation, e.g., to identify suboptimal app selection strategies. This paper addresses the problem of estimating the values of KPIs that would have been obtained if a different app had been implemented by the RAN. To this end, we propose a conformal-prediction-based counterfactual analysis method for wireless systems that provides reliable error bars for the estimated KPIs, despite the inherent covariate shift between logged and test data. Experimental results for medium access control-layer apps and for physical-layer apps demonstrate the merits of the proposed method.
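To make the idea of "error bars" from conformal prediction concrete, here is a minimal sketch of standard split conformal prediction for a scalar estimate such as a KPI. It is an illustration only, not the method proposed in the paper: it assumes calibration and test data are exchangeable, so it does not correct for the covariate shift the abstract mentions (the usual remedy is to weight the calibration residuals). All function and variable names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def conformal_quantile(residuals: Sequence[float], alpha: float) -> float:
    """Return the ceil((n + 1) * (1 - alpha))-th smallest residual.

    If that rank exceeds n, the calibration set is too small for the
    requested coverage and the only valid radius is infinite.
    """
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(residuals)[k - 1]


def conformal_interval(
    pred: float,
    cal_preds: Sequence[float],
    cal_targets: Sequence[float],
    alpha: float = 0.1,
) -> Tuple[float, float]:
    """Symmetric interval around `pred`, targeting coverage >= 1 - alpha
    under exchangeability (an assumption of this sketch)."""
    residuals = [abs(y - p) for p, y in zip(cal_preds, cal_targets)]
    q = conformal_quantile(residuals, alpha)
    return pred - q, pred + q


if __name__ == "__main__":
    # Hypothetical logged KPI estimates and the KPI values actually observed.
    cal_preds = [10.0, 12.0, 9.0, 11.0]
    cal_targets = [11.0, 12.0, 12.0, 9.0]
    print(conformal_interval(10.0, cal_preds, cal_targets, alpha=0.5))
```

The interval width comes entirely from held-out residuals, so the underlying KPI estimator can be any model; the coverage guarantee depends on the exchangeability assumption rather than on the estimator being accurate.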
- North America > United States > New York > New York County > New York City (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Performance-Driven QUBO for Recommender Systems on Quantum Annealers
Niu, Jiayang, Li, Jie, Deng, Ke, Sanderson, Mark, Ren, Yongli
We propose Counterfactual Analysis Quadratic Unconstrained Binary Optimization (CAQUBO) to solve QUBO problems for feature selection in recommender systems. CAQUBO leverages counterfactual analysis to measure the impact of individual features and feature combinations on model performance and employs the measurements to construct the coefficient matrix for a quantum annealer to select the optimal feature combinations for recommender systems, thereby improving their final recommendation performance. By establishing explicit connections between features and the recommendation performance, the proposed approach demonstrates superior performance compared to the state-of-the-art quantum annealing methods. Extensive experiments indicate that integrating quantum computing with counterfactual analysis holds great promise for addressing these challenges.
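As a rough illustration of how measured feature impacts could populate a QUBO coefficient matrix, the sketch below builds an upper-triangular matrix from hypothetical per-feature and pairwise performance gains and minimizes it by exhaustive search. Everything here is an assumption for illustration: the abstract does not specify how CAQUBO constructs its coefficients, the penalty term that encourages exactly `k` selected features is a common QUBO device rather than something taken from the paper, and brute force stands in for a quantum annealer.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def build_qubo(
    single_gain: Sequence[float],
    pair_gain: Dict[Tuple[int, int], float],
    k: int,
    penalty: float,
) -> List[List[float]]:
    """Upper-triangular Q such that minimizing sum_{i<=j} Q[i][j]*x[i]*x[j]
    rewards high-gain features and penalizes selecting a number other than k.

    The penalty expands penalty * (sum(x) - k)**2 using x_i**2 == x_i and
    drops the constant penalty * k**2.
    """
    n = len(single_gain)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][i] = -single_gain[i] + penalty * (1 - 2 * k)
        for j in range(i + 1, n):
            Q[i][j] = -pair_gain.get((i, j), 0.0) + 2 * penalty
    return Q


def energy(Q: List[List[float]], x: Sequence[int]) -> float:
    """QUBO objective for a binary assignment x."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def solve_brute_force(Q: List[List[float]]) -> Tuple[int, ...]:
    """Exhaustive minimizer; only feasible for a handful of variables."""
    return min(product((0, 1), repeat=len(Q)), key=lambda x: energy(Q, x))


if __name__ == "__main__":
    # Hypothetical measurements: features 0 and 2 help alone but overlap.
    Q = build_qubo([3.0, 1.0, 2.0], {(0, 2): -4.0}, k=2, penalty=10.0)
    print(solve_brute_force(Q))
```

With these made-up numbers the minimizer selects features 0 and 1 rather than the two individually strongest features, which shows why pairwise terms matter when choosing feature combinations.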
- North America > United States > New York > New York County > New York City (0.04)
- Africa > Comoros > Grande Comore > Moroni (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)